RVF Modeling Pipeline

Overview

This is the main targets pipeline script for building predictive models for Rift Valley fever (RVF) outbreaks in South Africa. It downloads processed environmental and outbreak data, trains machine learning models, and generates performance reports.

Pipeline Components

1. Model Data Targets

Downloads and prepares the RVF model dataset produced by the RVF Data Processing Pipeline.

2. Cross-Validation Targets

Nested Cross-Validation Approach

The RVF modeling pipeline uses a nested cross-validation strategy that accounts for both the temporal and the spatial structure of the data:

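Outer Loop (Expanding Window)

Each outer fold trains on all data up to a cutoff and tests on the period that follows, so the training window grows from one fold to the next.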

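Inner Loop (Leave-One-Location-Out)

Within each outer training set, one location at a time is held out and the model is fit on the remaining locations.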

This nested approach estimates model performance on data the model has not seen in either time or space by evaluating:

  1. Temporal generalization through the expanding window
  2. Spatial generalization through leave-one-location-out validation
  3. Combined spatio-temporal performance across multiple scenarios
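
The nesting described above can be sketched in a few lines. This is an illustration only, written in Python rather than the pipeline's own R/targets code, and it is not taken from the pipeline itself. The record schema (year, location), the function names, the `min_train` parameter, and the toy years and location labels are all assumptions made for the example.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical record schema: (year, location). Real data would carry features too.
Record = Tuple[int, str]


def expanding_window(years: List[int], min_train: int) -> Iterator[Tuple[List[int], int]]:
    """Outer loop: yield (training years, test year) with a growing training window."""
    ordered = sorted(set(years))
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def leave_one_location_out(records: List[Record]) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Inner loop: hold out each location in turn, fit on all the others."""
    locations = sorted({loc for _, loc in records})
    for held_out in locations:
        fit = [r for r in records if r[1] != held_out]
        val = [r for r in records if r[1] == held_out]
        yield held_out, fit, val


def nested_splits(records: List[Record], min_train: int = 2) -> Iterator[Dict]:
    """Combine both loops: inner spatial splits are built from outer training data only."""
    years = [year for year, _ in records]
    for train_years, test_year in expanding_window(years, min_train):
        train = [r for r in records if r[0] in train_years]
        test = [r for r in records if r[0] == test_year]
        yield {
            "train_years": train_years,
            "test_year": test_year,
            "train": train,
            "test": test,
            "inner": list(leave_one_location_out(train)),
        }


# Toy data (made up): four years, three locations.
records = [(year, loc) for year in (2015, 2016, 2017, 2018) for loc in ("A", "B", "C")]
folds = list(nested_splits(records, min_train=2))

for fold in folds:
    held_out = [name for name, _, _ in fold["inner"]]
    print(fold["train_years"], "->", fold["test_year"], "inner hold-outs:", held_out)
```

Because the inner splits are drawn only from the outer training set, no record from the test period is used when fitting or validating within a fold.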

3. Model Tuning Targets

(To be implemented)

Will handle hyperparameter optimization and model selection.

4. Model Fitting Targets

(To be implemented)

Will include:

5. Model Evaluation Targets

(To be implemented)

Planned evaluations:

Model Interpretability

(To be implemented)

6. Report Targets

(To be implemented)

Will generate:

7. Documentation Targets

(To be implemented)

Will automatically generate project documentation.

Key Features